    Solving Factored MDPs with Hybrid State and Action Variables

    Efficient representations and solutions for large decision problems with continuous and discrete variables are among the most important challenges faced by the designers of automated decision support systems. In this paper, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a new hybrid approximate linear programming (HALP) framework that permits their efficient solution. The central idea of HALP is to approximate the optimal value function by a linear combination of basis functions and to optimize its weights by linear programming. We analyze both theoretical and computational aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.
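
    The central idea, value-function approximation by basis functions with weights chosen by a linear program, can be sketched for a purely discrete MDP as below; the tiny chain MDP, the two basis functions, and the uniform state-relevance weights are assumptions made for this illustration, not the hybrid model or benchmarks from the paper.

    # Approximate linear programming sketch: V_w(s) = sum_i w_i * phi_i(s), with the
    # weights w chosen by a linear program (illustrative discrete chain MDP; the
    # basis functions and state-relevance weights are assumptions for this sketch).
    import numpy as np
    from scipy.optimize import linprog

    n_states, gamma = 5, 0.9
    actions = [0, 1]                              # 0 = stay, 1 = move right
    R = np.array([0.0, 0.0, 0.0, 0.0, 1.0])       # reward depends on the state only

    def transition(s, a):
        """Deterministic chain dynamics, returned as a distribution over next states."""
        p = np.zeros(n_states)
        p[min(s + a, n_states - 1)] = 1.0
        return p

    def phi(s):
        """Basis functions: a constant feature and a linear feature of the state index."""
        return np.array([1.0, s / (n_states - 1)])

    n_basis = 2
    alpha = np.full(n_states, 1.0 / n_states)     # state-relevance weights

    # Objective: minimise sum_s alpha(s) * V_w(s) = c . w
    c = sum(alpha[s] * phi(s) for s in range(n_states))

    # Constraints: V_w(s) >= R(s) + gamma * E[V_w(s') | s, a] for every (s, a),
    # rewritten as -(phi(s) - gamma * E[phi(s')]) . w <= -R(s) for linprog.
    A_ub, b_ub = [], []
    all_phi = np.array([phi(sp) for sp in range(n_states)])
    for s in range(n_states):
        for a in actions:
            expected_phi = transition(s, a) @ all_phi
            A_ub.append(-(phi(s) - gamma * expected_phi))
            b_ub.append(-R[s])

    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * n_basis)
    w = res.x
    print("basis weights:", w)
    print("approximate values:", [float(phi(s) @ w) for s in range(n_states)])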

    Near-optimal sensor placements: maximizing information while minimizing communication cost


    Human–agent collaboration for disaster response

    In the aftermath of major disasters, first responders are typically overwhelmed by large numbers of spatially distributed search and rescue tasks, each with its own requirements. Moreover, responders have to operate in highly uncertain and dynamic environments where new tasks may appear and hazards may spread across the disaster space. Hence, rescue missions may need to be re-planned as new information comes in, tasks are completed, or new hazards are discovered. Finding an optimal allocation of resources to complete all the tasks is a major computational challenge. In this paper, we use decision-theoretic techniques to solve the task allocation problem posed by emergency response planning and then deploy our solution as part of an agent-based planning tool in real-world field trials. By so doing, we are able to study the interactional issues that arise when humans are guided by an agent. Specifically, we develop an algorithm based on a multi-agent Markov decision process representation of the task allocation problem and show that it outperforms standard baseline solutions. We then integrate the algorithm into a planning agent that responds to requests for tasks from participants in a mixed-reality location-based game, called AtomicOrchid, that simulates disaster response settings in the real world. We then run a number of trials of our planning agent and compare it against a purely human-driven system. Our analysis of these trials shows that human commanders adapt to the planning agent by taking on a more supervisory role, and that giving humans the flexibility to request plans from the agent allows them to perform more tasks more efficiently than allocating tasks through purely human interaction. We also discuss how such flexibility could lead to poor performance if left unchecked.
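
    As a rough illustration of decision-theoretic task allocation (not the paper's algorithm or its AtomicOrchid setup), the sketch below runs value iteration on a tiny multi-agent MDP in which two responders each pick one of two tasks per time step; the state space, dynamics, and step cost are assumptions made for the example.

    # Toy multi-agent MDP for task allocation, solved by value iteration (illustrative
    # only; the states, dynamics, and step cost are assumptions for this sketch and are
    # unrelated to the paper's algorithm or its field-trial setup).
    import itertools

    N_TASKS, N_RESPONDERS, TASK_WORK, gamma = 2, 2, 1, 0.95

    states = list(itertools.product(range(TASK_WORK + 1), repeat=N_TASKS))
    joint_actions = list(itertools.product(range(N_TASKS), repeat=N_RESPONDERS))

    def step(state, joint_action):
        """Each responder removes one unit of work from the task it chose."""
        work = list(state)
        for task in joint_action:
            work[task] = max(0, work[task] - 1)
        return tuple(work)

    def cost(state):
        """A cost of -1 per time step until every task is finished."""
        return 0.0 if all(w == 0 for w in state) else -1.0

    V = {s: 0.0 for s in states}
    for _ in range(200):                          # value iteration to (near) convergence
        V = {s: max(cost(s) + gamma * V[step(s, a)] for a in joint_actions)
             if any(s) else 0.0
             for s in states}

    policy = {s: max(joint_actions, key=lambda a: V[step(s, a)])
              for s in states if any(s)}
    print("best joint action from (1, 1):", policy[(1, 1)])   # responders split the tasks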

    Feature selection for chemical sensor arrays using mutual information

    We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set and established the best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study, which used a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near-optimal features for chemical sensor arrays.
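
    A filter-style selection of this kind can be sketched with scikit-learn: rank features by estimated mutual information with the class label, keep the top k, and evaluate a classifier on the result. The synthetic data, the value of k, and the SVM settings below are assumptions for illustration, not the sensor-array data or parameters used in the study.

    # Filter-style feature selection by mutual information, followed by an SVM, on
    # synthetic data (the data set, k, and classifier settings are assumptions for
    # this sketch, not those of the chemical-sensor study).
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Stand-in for sensor-array features and chemical identities.
    X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                               n_classes=4, n_clusters_per_class=1, random_state=0)

    # Rank features by estimated mutual information with the class label and keep the
    # top k; no classifier is consulted during selection (the filter approach).
    selector = SelectKBest(score_func=mutual_info_classif, k=8)

    pipeline = make_pipeline(StandardScaler(), selector, SVC(kernel="rbf", C=1.0))
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(f"cross-validated accuracy with MI-selected features: {scores.mean():.3f}")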

    Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams

    Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. Agents engaged in individual decision making in multiagent settings face the task of having to reason about other agents’ actions, which may in turn involve reasoning about others. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DIDs). These are graphical models with the benefit that they naturally offer a factored representation of the problem, allowing agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely-nested reasoning by a self-interested agent is that optimal team solutions may not be obtained in cooperative settings when the agent is part of a team. We address this limitation by including models at level 0 whose solutions involve reinforcement learning. We show how the learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.
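
    The core idea, a level-0 teammate model that is solved by reinforcement learning and then best-responded to by a higher-level agent, can be caricatured as below; the 2x2 coordination payoffs, the bandit-style learner, and all parameters are assumptions made for this sketch, not the I-DID construction from the paper.

    # Minimal sketch of a level-0 teammate model learned with reinforcement learning and
    # then best-responded to by a higher-level agent (the coordination payoffs, learning
    # parameters, and assumed partner behaviour are illustrative only).
    import random

    ACTIONS = ["A", "B"]
    PAYOFF = {("A", "A"): 1.0, ("B", "B"): 1.0, ("A", "B"): 0.0, ("B", "A"): 0.0}
    partner_during_learning = "A"        # assumed behaviour the level-0 model learns against

    # Level-0 model: a bandit-style Q-learner that does not reason about the other agent.
    q0 = {a: 0.0 for a in ACTIONS}
    alpha, epsilon = 0.1, 0.1
    for _ in range(2000):
        act = random.choice(ACTIONS) if random.random() < epsilon else max(q0, key=q0.get)
        reward = PAYOFF[(act, partner_during_learning)]
        q0[act] += alpha * (reward - q0[act])

    level0_policy = max(q0, key=q0.get)  # the solved level-0 model predicts this action

    # Level-1 (ad hoc) agent: best-respond to the predicted level-0 behaviour.
    level1_action = max(ACTIONS, key=lambda a: PAYOFF[(a, level0_policy)])
    print("predicted level-0 action:", level0_policy)
    print("level-1 best response:", level1_action)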

    G89-989 How to Interpret the New Animal Model for Dairy Sire Evaluation

    In question-and-answer format, this NebGuide addresses changes in the genetic evaluations of both dairy cows and sires. Why is the United States Department of Agriculture (USDA) changing the dairy sire and cow evaluation system? The answer is simple. The Animal Model for genetic evaluations is more accurate than the old Modified Contemporary Comparison Method (M.C.C.). Previously, the major factors limiting implementation of the Animal Model were computing costs and memory requirements. With the advent of new supercomputers, the computations are feasible on a national scale. What is the Animal Model? The Animal Model simultaneously evaluates cows and sires using all their ancestor relationships. This means that every animal known in a given pedigree is used to evaluate both the cow and the sire. This increases the accuracy of evaluation and should be a major step toward breeder acceptance of the new evaluation system. Not only are all registered animal pedigrees included in the evaluation process, but all non-registered cattle pedigrees that have been identified properly on Dairy Herd Improvement (DHI) testing are included as well.
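
    In standard animal-breeding notation, an Animal Model of this kind is usually written as a linear mixed model and solved through Henderson's mixed-model equations; the sketch below uses that textbook form and is an illustration of the general idea, not USDA's exact specification.

    % Animal Model as a linear mixed model (textbook Henderson form; the notation is
    % illustrative, not USDA's exact specification):
    %   y : vector of milk-yield records
    %   b : fixed effects (e.g. herd-year-season)
    %   a : additive genetic (breeding) values of all animals in the pedigree
    %   A : numerator relationship matrix built from the pedigree
    \[
      \mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a} + \mathbf{e},
      \qquad
      \mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2),
      \quad
      \mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
    \]
    % Breeding values come from the mixed-model equations, with lambda = sigma_e^2 / sigma_a^2:
    \[
      \begin{bmatrix}
        \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
        \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \mathbf{A}^{-1}\lambda
      \end{bmatrix}
      \begin{bmatrix} \hat{\mathbf{b}} \\ \hat{\mathbf{a}} \end{bmatrix}
      =
      \begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
    \]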

    Exploiting Additive Structure in Factored MDPs for Reinforcement Learning

    SDYNA is a framework for addressing large, discrete, stochastic reinforcement learning problems. It incrementally learns an FMDP representing the problem to solve while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming that cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a linear programming based planning method that can exploit the additive structure of an FMDP and address problems out of reach of SPITI.
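
    The surrounding SDYNA loop, act, update a learned model from experience, and re-plan against it, can be caricatured with the flat (non-factored) Dyna-style sketch below; the chain environment, count-based model, and value-iteration planner are assumptions made for illustration and ignore both the factored representation and the linear programming planners (ULP/UNATLP) that are the paper's contribution.

    # Flat Dyna-style caricature of the SDYNA loop: learn a model from experience and
    # re-plan with it after every step (illustrative; the real framework uses factored
    # MDP representations and, in ULP/UNATLP, linear-programming planners).
    import random
    from collections import defaultdict

    N, GOAL, gamma = 8, 7, 0.95                   # small chain environment (assumed)
    ACTIONS = [-1, +1]

    def env_step(s, a):
        s2 = min(max(s + a, 0), N - 1)
        return s2, (1.0 if s2 == GOAL else 0.0)

    counts = defaultdict(lambda: defaultdict(int))    # learned model: (s, a) -> {s2: count}
    rewards = defaultdict(float)                      # learned model: (s, a) -> reward
    V = [0.0] * N

    def q_from_model(s, a):
        """Action value under the learned model; unvisited pairs have no estimate yet."""
        c = counts[(s, a)]
        total = sum(c.values())
        if total == 0:
            return 0.0
        return rewards[(s, a)] + gamma * sum(n / total * V[s2] for s2, n in c.items())

    def plan(sweeps=50):
        """Value iteration against the learned model."""
        for _ in range(sweeps):
            for s in range(N):
                V[s] = max(q_from_model(s, a) for a in ACTIONS)

    s = 0
    for t in range(500):
        if random.random() < 0.2:                 # occasional exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q_from_model(s, act))
        s2, r = env_step(s, a)
        counts[(s, a)][s2] += 1                   # incremental model update
        rewards[(s, a)] = r
        plan()                                    # re-plan with the updated model
        s = 0 if s2 == GOAL else s2
    print("learned state values:", [round(v, 2) for v in V])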